Mining approximate patterns with frequent locally optimal occurrences

Authors

  • Atsuyoshi Nakamura
  • Ichigaku Takigawa
  • Hisashi Tosaka
  • Mineichi Kudo
  • Hiroshi Mamitsuka
Abstract

We propose a novel frequent approximate pattern mining method suited to estimating occurrence regions. Given a string s, our method enumerates the substrings of s that locally optimally match many substrings of s. We give an algorithm for this problem in which candidate patterns are generated without duplication using the suffix tree of s. The problem can be extended to that of enumerating approximate frequent subforests of a given ordered labeled tree T. We applied our method to extracting search result records from web pages returned by search engines, where it performed well on benchmark data sets.
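
To make the idea of counting approximate occurrences concrete, here is a minimal Python sketch. It is not the authors' algorithm. The abstract does not define "locally optimally match", so the sketch assumes a simple stand-in: an occurrence is counted at an end position where the edit distance between the pattern and the best substring ending there is at most k and is a local minimum relative to the neighbouring end positions. Candidate patterns are deduplicated with a plain set, a naive substitute for the suffix tree mentioned in the abstract. The names end_distances, local_minima_ends, and frequent_patterns and the parameters k and min_count are hypothetical.

```python
def end_distances(pattern, text):
    """Sellers' algorithm: result[j] is the minimum edit distance between
    `pattern` and any substring of `text` ending at position j (0..len(text))."""
    prev = [0] * (len(text) + 1)  # an empty pattern matches anywhere at cost 0
    for i, pc in enumerate(pattern, start=1):
        cur = [i] + [0] * len(text)  # i pattern chars against an empty substring
        for j, tc in enumerate(text, start=1):
            cur[j] = min(
                prev[j - 1] + (pc != tc),  # match or substitution
                prev[j] + 1,               # pattern char left unmatched
                cur[j - 1] + 1,            # extra text char
            )
        prev = cur
    return prev


def local_minima_ends(pattern, text, k):
    """End positions whose distance is <= k and a local minimum.
    ASSUMPTION: this is a stand-in for 'locally optimal occurrence',
    not the definition used in the paper."""
    d = end_distances(pattern, text)
    ends = []
    for j in range(1, len(d)):
        left = d[j - 1]
        right = d[j + 1] if j + 1 < len(d) else float("inf")
        # On a plateau of equal distances, only the rightmost end is kept.
        if d[j] <= k and d[j] <= left and d[j] < right:
            ends.append(j)
    return ends


def frequent_patterns(s, length, k, min_count):
    """Naive enumeration: every distinct substring of `s` with the given
    length is a candidate; keep those with at least `min_count` occurrences.
    ASSUMPTION: a set replaces the suffix-tree-based candidate generation."""
    candidates = {s[i:i + length] for i in range(len(s) - length + 1)}
    result = {}
    for p in sorted(candidates):
        count = len(local_minima_ends(p, s, k))
        if count >= min_count:
            result[p] = count
    return result
```

Under these assumptions, the pattern "abc" in the string "abcxabdxabc" with k = 1 is counted at end positions 3, 7, and 11: two exact occurrences and one occurrence ("abd") at distance 1. The quadratic enumeration is for illustration only and would not scale to long inputs.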

Similar resources

REAFUM: Representative Approximate Frequent Subgraph Mining

Noisy graph data and pattern variations are two thorny problems faced by mining frequent subgraphs. Traditional exact-matching based methods, however, only generate patterns that have enough perfect matches in the graph database. As a result, a pattern may either remain undetected or be reported as multiple (almost identical) patterns if it manifests slightly different instances in different gr...


Relationship-aware sequential pattern mining: results on medical practise on antibiotic treatment and resistance development

Relationship-aware sequential pattern mining is the problem of mining frequent patterns in sequences in which the events of a sequence are mutually related by one or more concepts from some respective hierarchical taxonomies, based on the type of the events. Additionally events themselves are also described with a certain number of taxonomical concepts. We present RaSP an algorithm that is able...


Relationship-aware sequential pattern mining

Relationship-aware sequential pattern mining is the problem of mining frequent patterns in sequences in which the events of a sequence are mutually related by one or more concepts from some respective hierarchical taxonomies, based on the type of the events. Additionally events themselves are also described with a certain number of taxonomical concepts. We present RaSP an algorithm that is able...


Mining Approximate Frequent Patterns from Graph Databases

Graph analytics is the process of discovering patterns and insights from data that can be modeled as graphs. Algorithms for graph analytics fall into two broad categories : Mining and Management. Graph mining algorithms are often used in graph management and vice versa. In recent times, these algorithms have become an indispensable tool for analyzing networks in domains such as i) Computational...


Distributed Discovery of Multi-Level Approximate Process Patterns

Process mining focuses on the discovery of knowledge about a (business) process from a set of its executions stored in an event log. Each event describes an activity and its performer. Process mining techniques allows automatically extracting the process model that gains insight into various perspectives, such as the control flow perspective, data, and organizational perspective. In this paper,...



Journal:
  • Discrete Applied Mathematics

Volume 200  Issue 

Pages  -

Publication date 2016